A Monte Carlo simulation study comparing linear regression, beta regression, variable-dispersion beta regression and fractional logit regression at recovering average difference measures in a two sample design
Abstract
BACKGROUND In biomedical research, response variables often have bounded support on the open unit interval (0,1). Traditionally, researchers have estimated covariate effects on such responses using linear regression. Alternative modelling strategies include beta regression, variable-dispersion beta regression, and fractional logit regression. This study uses a Monte Carlo simulation design to compare the statistical properties of the linear regression model with those of the more novel beta regression, variable-dispersion beta regression, and fractional logit regression models.
METHODS In the Monte Carlo experiment we assume a simple two sample design. We assume observations are realizations of independent draws from their respective probability models. The simulated draws from the various probability models are chosen to emulate average proportion/percentage/rate differences of pre-specified magnitudes. After simulating the experimental data, we estimate average proportion/percentage/rate differences. We compare the estimators in terms of bias, variance, type-1 error and power. Estimates of the Monte Carlo error associated with these quantities are provided.
RESULTS If response data are beta distributed with constant dispersion parameters across the two samples, then all models are unbiased and have reasonable type-1 error rates and power profiles. If the response data in the two samples have different dispersion parameters, then the simple beta regression model is biased. When the sample size is small (N0 = N1 = 25), linear regression has superior type-1 error rates compared to the other models. Small sample type-1 error rates can be improved in beta regression models using bias correction/reduction methods. In the power experiments, variable-dispersion beta regression and fractional logit regression models have slightly elevated power compared to linear regression models. Similar results were observed when the response data were generated from a discrete multinomial distribution with support on (0,1).
CONCLUSIONS The linear regression model, the variable-dispersion beta regression model and the fractional logit regression model all perform well across the simulation experiments under consideration. When employing beta regression to estimate covariate effects on (0,1) response data, researchers should ensure their dispersion sub-model is properly specified; otherwise inferential errors could arise.
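The simulation design described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it covers only the linear regression estimator (for a two sample design, the slope on a group indicator equals the difference in sample means), and it uses a large-sample Wald z-test with unequal variances as a stand-in for the test used in the paper. The function names, the default number of replications, and the mean/precision parameterization p = mu * phi, q = (1 - mu) * phi are assumptions made for illustration.

```python
import random
import statistics
from math import sqrt


def draw_beta(mu, phi, n, rng):
    """Draw n values from Beta(p, q), assuming p = mu*phi and q = (1-mu)*phi."""
    p, q = mu * phi, (1.0 - mu) * phi
    return [rng.betavariate(p, q) for _ in range(n)]


def simulate(mu0, mu1, phi0, phi1, n=25, reps=2000, alpha=0.05, seed=1):
    """Monte Carlo summary of the difference-in-means estimator.

    Hypothetical helper: returns bias, variance, rejection rate and
    Monte Carlo standard errors for a two sample design with n draws
    per group.
    """
    rng = random.Random(seed)
    z_crit = statistics.NormalDist().inv_cdf(1 - alpha / 2)
    estimates = []
    rejections = 0
    for _ in range(reps):
        y0 = draw_beta(mu0, phi0, n, rng)
        y1 = draw_beta(mu1, phi1, n, rng)
        # Difference in means = OLS slope on a 0/1 group indicator.
        diff = statistics.fmean(y1) - statistics.fmean(y0)
        # Unequal-variance standard error (assumed test, not from the paper).
        se = sqrt(statistics.variance(y0) / n + statistics.variance(y1) / n)
        estimates.append(diff)
        if abs(diff / se) > z_crit:
            rejections += 1
    reject_rate = rejections / reps
    return {
        "bias": statistics.fmean(estimates) - (mu1 - mu0),
        "variance": statistics.variance(estimates),
        # Type-1 error when mu0 == mu1, power otherwise.
        "reject_rate": reject_rate,
        "mc_se_bias": statistics.stdev(estimates) / sqrt(reps),
        "mc_se_reject": sqrt(reject_rate * (1 - reject_rate) / reps),
    }


if __name__ == "__main__":
    # Null scenario: equal means, so reject_rate estimates type-1 error.
    print(simulate(0.05, 0.05, 1000, 1000))
    # Alternative scenario: unequal means, so reject_rate estimates power.
    print(simulate(0.05, 0.10, 1000, 1000))
```

Setting `phi0` and `phi1` to different values reproduces the unequal-dispersion scenario in spirit. Comparing against beta regression or fractional logit fits would require a regression library and is left out of this sketch.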
Related resources
Author's response to reviews. Title: A Monte Carlo Simulation Study Comparing Linear Regression, Beta Regression, Variable-Dispersion Beta Regression and Fractional Logit Regression at Recovering Average Difference Measures in a Two Sample Design. Authors:
Erratum to: A Monte Carlo simulation study comparing linear regression, beta regression, variable-dispersion beta regression and fractional logit regression at recovering average difference measures in a two sample design
Erratum After publication of the original article [1], the authors noticed an error in Fig. 1. The legend included in the original sub-plot of Fig. 1 was labelled “phi = 500 (p = 25, q = 475)”; however, the figure title suggested phi = 1000. An updated version of Fig. 1 is published in this erratum, where the legend has been updated to “phi = 1000 (p = 50, q = 950)” to be consistent with the fi...
Bayesian Inference for Spatial Beta Generalized Linear Mixed Models
In some applications, the response variable assumes values in the unit interval. The standard linear regression model is not appropriate for modelling this type of data because the normality assumption is not met. Alternatively, the beta regression model has been introduced to analyze such observations. A beta distribution represents a flexible density family on the (0, 1) interval that covers symm...
Finite Sample Properties of Quantile Interrupted Time Series Analysis: A Simulation Study
Interrupted Time Series (ITS) analysis represents a powerful quasi-experimental design in which a discontinuity is enforced at a specific intervention point in a time series, and separate regression functions are fitted before and after the intervention point. Segmented linear/quantile regression can be used in ITS designs to isolate intervention effects by estimating the sudden/level change (...
A Monte Carlo simulation technique for assessment of earthquake-induced displacement of slopes
The dynamic response of slopes against earthquake is commonly characterized by the earthquake-induced displacement of slope (EIDS). The EIDS value is a function of several variables such as the material properties, slope geometry, and earthquake acceleration. This work is aimed at the prediction of EIDS using the Monte Carlo simulation method (MCSM). Hence, the parameters height, unit specific ...